Spectral Analysis of Protein-Protein Interactions in Drosophila melanogaster 
Within a case study on the protein-protein interaction network (PIN) of Drosophila melanogaster
we investigate the relation between the network's spectral properties and its structural features such 
as the prevalence of specific subgraphs or duplicate nodes as a result of its evolutionary history. The 
discrete part of the spectral density shows fingerprints of the PIN's topological features including 
a preference for loop structures. Duplicate nodes are another prominent feature of PINs and we 
discuss their representation in the PIN's spectrum as well as their biological implications. 

PACS numbers: 89.75.-k, 89.20.-a, 89.75.Hc, 89.75.Fb, 87.16.Yc, 87.16.-b, 87.10.+e, 02.50.Fz



I. INTRODUCTION 

Network structures can be observed in the most diverse
domains, ranging from biological and technological systems
to social or economic systems. Genetic regulatory
networks, protein-protein interaction networks and
metabolic networks support the functions of life in any
living organism. Technological networks such as the
internet or the World Wide Web have a huge impact on our
lives and societies. Networks of acquaintances and the
exchange of information within these networks shape social
and economic systems. Considering the omnipresence
of networks, their investigation has a long tradition in
graph theory. However, during the last few years
high-quality data on real-world networks have revealed
that they cannot be adequately described by standard
models from random graph theory, and the topic has
attracted growing interest. Still, much attention has been
devoted to the derivation of rather specific quantities
like degree distributions or clustering coefficients that
do not allow for a classification and understanding of
network topologies within a broader and self-consistent
framework.

Making an attempt towards a more comprehensive
description, spectral graph theory can be considered
a promising ansatz. A network of $N$ nodes
can be described by its adjacency matrix $A = (a_{ij})$ with
entries

$$a_{ij} = \begin{cases} 1 & \text{if there is a link between nodes } i \text{ and } j, \\ 0 & \text{otherwise.} \end{cases} \qquad (1)$$



The adjacency matrix is a symmetric, non-negative
matrix in the case of undirected networks and accordingly
has real eigenvalues $\lambda_j$, $j = 1, \ldots, N$, being solutions
of $\det(A - \lambda I) = 0$. The relation between features of a
network and properties of its spectral density

$$\rho(\lambda) = \frac{1}{N} \sum_{j=1}^{N} \delta(\lambda - \lambda_j) \qquad (2)$$



with respect to its adjacency matrix is a topic of current
research. While dense classical random networks exhibit
a semi-circular spectral density of the adjacency matrix,
networks with a broad or scale-free degree distribution
give rise to a broader spectrum. A striking feature of
sparse random networks' spectral density is the emergence
of peaks at eigenvalues of finite trees [9], similar to those
found in large random trees, due to the strong prevalence
of these subgraphs. Here we address whether these findings
are applicable more generally, that is, whether peaks in the
spectral density of sparse random networks can be associated
with a strong prevalence of specific subgraphs. The search
for subgraphs that are statistically overrepresented relative
to a null model, so-called motifs, has recently gained
much attention. As a case study on the relation
between these two approaches, we investigate the spectral
properties of the protein-protein interaction network
(PIN) of the fruit fly Drosophila melanogaster [19]. While
no simple correspondence between network motifs and a
network's spectral properties can be derived, on a more
abstract level we infer from the PIN's spectrum a prevalence
of loop structures. Furthermore, some properties
specific to a network that has evolved by duplication of
nodes are studied and discussed within the context of
spectral analysis.
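The definitions in Eqs. (1) and (2) can be made concrete with a minimal sketch. It assumes NumPy is available; the helper names `adjacency_matrix` and `spectrum` and the four-node toy graph are hypothetical illustrations and are not taken from the study.

```python
import numpy as np

def adjacency_matrix(n_nodes, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph, cf. Eq. (1)."""
    a = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return a

def spectrum(a):
    """Real eigenvalues of a symmetric matrix, in ascending order."""
    return np.linalg.eigvalsh(a)

# Hypothetical toy graph: a triangle (nodes 0, 1, 2) with a pendant node 3.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
toy_spectrum = spectrum(adjacency_matrix(4, toy_edges))
```

The spectral density of Eq. (2) is then simply the normalized histogram of the returned eigenvalues. Two quick sanity checks follow from the matrix structure: the eigenvalues sum to zero (no self-links) and their squares sum to twice the number of links.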



II. THE SPECTRUM OF THE PIN OF 
DROSOPHILA MELANOGASTER 



For our study we used the PIN of Drosophila
melanogaster as given in [19] and available via the
Database of Interacting Proteins [20]. The protein-protein
interactions have been derived using the two-hybrid
method which, however, is known to generate
many false positives. Therefore each interaction in the
network is classified by a confidence value between zero
and one, defining a hierarchy of networks with increasing
minimal confidence value for the protein-protein
interactions. In Fig. 1 the size of the largest connected
component in a network with a given minimal confidence value
of interactions is shown.

*Electronic address: jmMTlchristelkamp.de, c.kamp@imperial.ac.uk
†Electronic address: k.christensen@imperial.ac.uk

FIG. 1: The number of nodes divided by 10000 (solid line) and
the fraction of nodes (dashed line) in the largest connected
component in the PIN as a function of the minimal confidence
value of protein-protein interaction. We focus on the PIN
defined by a minimal confidence value of 0.5, see dashed line.

FIG. 2: Spectral analysis of the protein-protein interaction
network of Drosophila melanogaster. (a) The cumulative spectral
density. (b) The discrete frequency spectrum containing
49% of all eigenvalues.

For our further analysis we choose a network with a minimal
confidence value of 0.5, which contains 4681 proteins
and 4794 interactions, corresponding to an average degree
$\langle k \rangle = 2.05$. The network is enriched with biologically
meaningful interactions while it still shows a strong
largest connected component (i.e. a giant component)
containing about 2/3 of its nodes.

We determined the eigenvalues of the adjacency matrix
corresponding to this PIN. The cumulative spectral density
in Fig. 2(a) exhibits jumps at various eigenvalues,
which are represented by the discrete spectrum in
Fig. 2(b) [31]. Since about 2/3 of the network's nodes belong
to its giant component and 49% of the eigenvalues
in the network's spectrum are in the discrete spectrum,
the emergence of spectral peaks cannot be explained by
small isolated clusters alone.
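As an illustration of how a discrete part of a spectrum might be extracted numerically, the following sketch counts degenerate eigenvalues. It rests on an assumption that is not confirmed by the text: "discrete spectrum" is taken to mean eigenvalues that occur at least `min_count` times after rounding to `decimals` digits. Both parameters, the function names and the toy network are hypothetical.

```python
from collections import Counter
import numpy as np

def discrete_part(eigenvalues, min_count=2, decimals=6):
    """Map each (rounded) eigenvalue occurring at least min_count times to its count."""
    counts = Counter(np.round(eigenvalues, decimals).tolist())
    return {value: n for value, n in counts.items() if n >= min_count}

def discrete_fraction(eigenvalues, min_count=2, decimals=6):
    """Fraction of all eigenvalues that lie in the degenerate (discrete) part."""
    peaks = discrete_part(eigenvalues, min_count, decimals)
    return sum(peaks.values()) / len(eigenvalues)

# Hypothetical toy network: two isolated links (0-1, 2-3) and a path 4-5-6.
a = np.zeros((7, 7))
for i, j in [(0, 1), (2, 3), (4, 5), (5, 6)]:
    a[i, j] = a[j, i] = 1.0
eigs = np.linalg.eigvalsh(a)
peaks = discrete_part(eigs)      # the eigenvalues -1 and +1 each occur twice
share = discrete_fraction(eigs)  # 4 of the 7 eigenvalues are degenerate
```

In this toy case the peaks stem entirely from small isolated clusters, which is precisely the explanation the text rules out as sufficient for the PIN.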



A. The discrete spectrum and network motifs 

To get a better understanding of the emergence of spectral
peaks we compare the discrete spectrum with the
corresponding spectra of two reference networks. First,
we look at a network of the same size and degree sequence
but with randomized links, following the procedure of [21] (a
randomized PIN). Second, we consider a classical random
network of the same size and average degree $\langle k \rangle = 2.05$
(a random network), that is, a network with a probability
$p = 0.000438$ for a link between any two nodes. In
Fig. 3 the discrete spectra of the adjacency matrices
of the protein-protein interaction network of Drosophila
melanogaster and of the two reference networks are
shown, the latter being averages over 10 reference networks.
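The two reference ensembles can be sketched as follows. The numbers of proteins and interactions are those quoted in the text; everything else is an assumption made for illustration: the link probability is computed as $p = \langle k \rangle / (N - 1)$, and the degree-preserving randomization is implemented as repeated link swaps, which may differ in detail from the procedure of [21]. The function names are hypothetical.

```python
import random

N_NODES, N_LINKS = 4681, 4794            # values quoted in the text
mean_degree = 2 * N_LINKS / N_NODES      # about 2.05
link_prob = mean_degree / (N_NODES - 1)  # about 0.000438

def random_network(n, p, rng):
    """Classical random graph: each of the n(n-1)/2 node pairs is linked with probability p."""
    return {(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p}

def randomize_links(edges, n_swaps, rng):
    """Degree-preserving randomization: swap partners of two links (a-b, c-d -> a-d, c-b)."""
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:           # choose one of the two possible pairings
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create a self-link or a duplicate link.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_set
```

Each accepted swap leaves the degree of all four nodes involved unchanged, so the randomized network keeps the degree sequence of the original while its links are reshuffled. Generating the full-size classical random network with `random_network` loops over roughly 11 million node pairs, so a sparse sampling scheme would be preferable in practice.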

To get more reliable results, we concentrate our further
analysis only on eigenvalues that can be found more than
twice in the spectrum of the original network. Qualitatively,
we see that while the classical random network
shows only a few peaks of that size, corresponding to the
eigenvalues of simple tree graphs (2.1, 3.1, 4.1, 4.2 in
Fig. 4), additional eigenvalues appear in the discrete
spectrum of the randomized PIN with the same degree
sequence as the original network and, eventually, in that
of the original network (see Fig. 3). This change in the spectral
properties indicates some differences in the structural
organization of the underlying networks. In the following
paragraphs, we will discuss how the observed hierarchy
of spectral peaks reflects the networks' topologies and
relates to other concepts like the search for motifs.
FIG. 3: The discrete frequency spectrum of (a) the PIN of
Drosophila melanogaster containing 49% of all eigenvalues,
(b) a randomized PIN with identical degrees at each node
containing 43% of all eigenvalues, and (c) a classical random
network of identical size and average degree $\langle k \rangle = 2.05$
containing 27% of all eigenvalues.



Following the arguments of [9], we suggest that the
prevalence of specific peaks in the discrete spectrum of
a network corresponds to a strong representation of certain
subgraphs. It has recently been shown that networks
from different contexts show a characteristic
overrepresentation of specific subnetworks, which are usually
referred to as motifs. Although motifs can be expected
to leave marks in a network's spectrum, there is
seemingly no simple correspondence between the eigenvalues
of small subgraphs and spectral peaks. First, subgraphs
are not generally represented by their eigenvalues
in the spectrum of the whole network. Second, isospectral
graphs are not necessarily isomorphic [23]. Nevertheless,
a thorough comparative study of the discrete spectrum
can provide some insight into the networks' structure.
In Figs. 4 and 5 we show the connected subgraphs
up to size 5 with the full set of eigenvalues present in the
discrete spectrum of the whole network. It shows that
the spectrum of the PIN is more consistent with loop
structures (cf. graphs 3.2, 4.3, 5.4, 5.5, 5.6, 5.8) than
any randomized version, which might hint at regulatory
functionality supplied by this network. The eigenvalues
behind these structures might correspond to eigenvalues
of trees, e.g., the eigenvalues of a triangle (graph 3.2)
or the box (graph 4.3, often also referred to as bi-fan
structure) might well be explained by graphs 2.1 and
5.3. However, to represent the eigenvalues of graphs 5.5
and 5.6 one has to consider trees of minimum size 8 and
7, respectively. The eigenvalues of graph 5.8 cannot be
found among trees of size up to 10. Considering that the
FIG. 4: Connected subgraphs with up to 4 nodes: A bullet
(•) in the three middle columns denotes that the eigenvalues
of this graph can be found in the spectrum of the original
network (PIN), the randomized network (Rand. PIN) or the
random network (Rand. network), respectively. The rightmost
column shows whether the subgraph is a motif according
to the mfinder software (default settings, [17, 22]). White
bullets (o) correspond to single eigenvalue occurrences.
FIG. 5: Connected subgraphs with 5 nodes: A bullet (•) in
the three middle columns denotes that the eigenvalues of this
graph can be found in the spectrum of the original network
(PIN), the randomized network (Rand. PIN) or the random
network (Rand. network), respectively. The rightmost column
shows whether the subgraph is a motif according to the
mfinder software (default settings, [17, 22]). White bullets
(o) correspond to single eigenvalue occurrences.



frequency of a given tree of size n in a sparse network decreases
exponentially with n [9], and relating the findings
in the PIN to those in the randomized reference networks,
we hypothesize that the spectral peculiarities reflect the
loop structure in the original network.
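The ambiguity discussed above, namely that the eigenvalues of a loop may also be produced by trees, can be checked numerically. The sketch below assumes NumPy; the helper names are hypothetical, and the four-leaf star is used purely as an example of a small tree (it is an assumption, not a statement from the text, that it corresponds to one of the labelled graphs in Fig. 5).

```python
import numpy as np

def graph_eigenvalues(n_nodes, edges):
    """Adjacency eigenvalues of a small undirected graph."""
    a = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return np.linalg.eigvalsh(a)

def eigenvalues_found(sub_eigs, net_eigs, tol=1e-8):
    """True if every eigenvalue of the subgraph occurs in the given spectrum."""
    net = np.asarray(net_eigs)
    return all(bool(np.any(np.abs(net - lam) < tol)) for lam in sub_eigs)

triangle = graph_eigenvalues(3, [(0, 1), (1, 2), (0, 2)])      # 2, -1, -1
link = graph_eigenvalues(2, [(0, 1)])                          # +1, -1
star4 = graph_eigenvalues(5, [(0, 1), (0, 2), (0, 3), (0, 4)]) # +2, -2, 0, 0, 0

# Spectrum of a network consisting only of trees: one link and one star.
forest_spectrum = np.concatenate([link, star4])
triangle_hidden_in_trees = eigenvalues_found(triangle, forest_spectrum)
```

Here the triangle's eigenvalues are all present although the network contains no loop at all, which is why the presence of a subgraph's eigenvalues is only consistent with, and not proof of, that subgraph.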
To quantify the correspondence between the number of
specific subgraphs in the PIN and the PIN's discrete spectrum,
we tried to decompose the spectrum into the contributions
of connected subgraphs up to size 5. This,
however, was not feasible, indicating that higher-order
contributions, though individually small, cannot
be neglected as a whole.

Although we have to concede that there is no simple
correspondence between subgraphs of a network and the
prevalence of their eigenvalues in the discrete spectrum
of the whole network, we want to discuss the relation of
spectral properties to the notion of motifs. According
to the definition introduced in Ref. [17], a motif is a
subnetwork that shows strong prevalence within the network
relative to a randomized network. For our analysis
we refer to the default requirements implemented in
the mfinder software [17, 22], that is, a motif is a subgraph
that occurs at least two standard deviations
more often than in 100 randomized networks with the same
degree sequence. In Figs. 4 and 5 the rightmost columns
show which connected subgraphs up to size 5 are motifs
in the PIN according to these criteria. There exist many
highly connected motifs, while the spectrum reflects
more the tree structures in the network. However, the
fingerprints of trees in the spectrum of both the original
and the randomized PIN are consistent with the fact that
they do not show up as motifs according to the above definition.
One might further speculate whether some motifs
are hidden from spectral analysis because they are in
fact building blocks of larger units. For example, graph
5.5 as well as its subgraph 4.4 is a motif according to
[17, 22], but only the eigenvalues of 5.5 can be found in
the spectrum of the PIN. Moreover, highly connected motifs
do not occur in high (absolute) numbers and might
accordingly be drowned out in spectral analysis.
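The two-standard-deviation criterion quoted above can be written down in a few lines. This is only a sketch of that single stated requirement and not a reimplementation of the mfinder software, whose default settings may include further conditions. The function names, the use of the population standard deviation and the example counts are assumptions.

```python
from statistics import mean, pstdev

def z_score(count, random_counts):
    """Excess of a subgraph count over the randomized ensemble, in standard deviations."""
    mu, sigma = mean(random_counts), pstdev(random_counts)
    if sigma == 0:
        return float("inf") if count > mu else 0.0
    return (count - mu) / sigma

def is_motif(count, random_counts, z_min=2.0):
    """Stated criterion: at least z_min standard deviations above the ensemble mean."""
    return z_score(count, random_counts) >= z_min

# Hypothetical counts of one subgraph in five randomized networks.
ensemble = [10, 12, 8, 11, 9]
strongly_overrepresented = is_motif(30, ensemble)
within_fluctuations = is_motif(11, ensemble)
```

Because the criterion is relative to the ensemble spread, a subgraph can qualify as a motif even when its absolute count is small, which is in line with the remark that highly connected motifs may be too rare to stand out in the spectrum.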

B. The circuitry of the PIN 

In section II A we have shown that the discrete spectrum
of the PIN of Drosophila melanogaster favors the eigenvalues
of loopy subgraphs. This observation, derived from
the investigation of distinct local structures and their
eigenvalue representations, can be confirmed by an assessment
of the whole set of eigenvalues. Evaluating the
trace of the matrix $A^k$,

$$\mathrm{Tr}(A^k) = \sum_{i=1}^{N} \lambda_i^k, \qquad (3)$$

yields the number of directed loops of length $k$ in the
underlying network [10], as shown in Fig. 6, though
neglecting details of the graphs underlying the loops. Note,
FIG. 6: (a) The frequency of loops of size k in the PIN of
Drosophila melanogaster (solid line), a randomized PIN with
identical degrees at each node (dashed line) and a classical
random network of identical size and average degree $\langle k \rangle = 2.05$
(dotted line). Odd cycles represent non-trivial loops,
that is, deviations from a tree-like structure in the networks.
(b) The relative frequency of loops of size k in the PIN of
Drosophila melanogaster with respect to the randomized PIN
(dashed line) and the classical random network (dotted line).



that even loops might be trivial, going back and forth in
a tree, while odd loops are non-trivial. The difference
in growth rates of the numbers of loops of growing size
between the original network and the classical random
network is likely due to the strong fragmentation of the
latter (many isolated nodes). However, the strong
relative prevalence of loops of odd length in the original
network is more remarkable with respect to the networks'
topologies. This becomes more obvious from
Fig. 6(b), showing the number of loops of a given size in the
original PIN normalized to the numbers in the two reference
networks. While tree graphs only have trivial loops
of even length, loops of odd length indicate non-trivial
loops, which confirms the results derived from the evaluation
of the discrete spectrum on the basis of eigenvalue
representations of small subgraphs.
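Equation (3) and the distinction between trivial and non-trivial loops can be illustrated on two toy graphs. The sketch assumes NumPy; the function names and example graphs are hypothetical. "Loops" are counted here as closed walks, i.e. including the trivial back-and-forth walks mentioned in the text.

```python
import numpy as np

def closed_walks(a, k):
    """Tr(A^k): number of closed walks (directed loops) of length k, cf. Eq. (3)."""
    return int(round(np.trace(np.linalg.matrix_power(a, k))))

def closed_walks_from_spectrum(a, k):
    """The same quantity evaluated as the sum of the k-th powers of the eigenvalues."""
    return float(np.sum(np.linalg.eigvalsh(a) ** k))

def adjacency(n_nodes, edges):
    a = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return a

path = adjacency(4, [(0, 1), (1, 2), (2, 3)])      # a tree
triangle = adjacency(3, [(0, 1), (1, 2), (0, 2)])  # the smallest odd cycle

tree_odd = closed_walks(path, 3)          # trees have no closed walks of odd length
tree_even = closed_walks(path, 2)         # trivial loops: twice the number of links
triangle_odd = closed_walks(triangle, 3)  # 3 starting nodes x 2 directions
```

The tree yields zero for every odd $k$, so any non-zero odd moment of the spectrum signals a deviation from a tree-like structure.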



The analysis of a network's discrete spectrum can reveal
some structural information about the network as a
whole. This information is less specific than an analysis
in terms of motifs, that is, only conclusions about more
general properties like the prevalence of loops are possible
instead of exact motif counts. However, it should be
emphasized that spectral analysis is not hampered by an
a priori bias towards predefined quantities like motifs of a
given size. It is a challenging question for future research
to investigate the relationship between a network's spectrum
and its topological features, e.g. in terms of motifs,
in more detail to get a more rigorous and unbiased
characterization of a network's topological features.



III. FINGERPRINTS OF DUPLICATION 

The evolution of many biological networks and specifically
PINs is assumed to be strongly driven by duplication
(and diversification) of nodes in the network [24, 25].
The genomes underlying the PINs of many organisms
have undergone a few whole-genome duplications complemented
by many single-gene duplications [26]. After
duplication, one of the duplicates usually diverges from
its original appearance, possibly providing new functionality.
The concept of duplication has similarly been recognized
to be important for functional roles in a network
motif [27]. The search for fingerprints of the evolutionary
history of a PIN naturally has to include an assessment of
duplicate nodes, that is, those that share the same interaction
partners. Each set of duplicate nodes represents
an equivalence class, also referred to as an orbit. The reduced
network is a network in which all nodes of an orbit
are reduced to one node.
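The notions of orbit and reduced network translate directly into a short routine. The sketch below is one possible reading of the definitions: the network is assumed to be given as a dictionary mapping each node to the set of its neighbors, isolated nodes are skipped, and all function names are hypothetical.

```python
from collections import defaultdict

def orbits(neighbors):
    """Group nodes that have exactly the same set of interaction partners."""
    groups = defaultdict(list)
    for node, nbrs in neighbors.items():
        if nbrs:  # isolated nodes are neglected
            groups[frozenset(nbrs)].append(node)
    return [sorted(group) for group in groups.values()]

def reduce_network(neighbors):
    """Collapse every orbit onto one representative; return reduced nodes and links."""
    rep = {node: orbit[0] for orbit in orbits(neighbors) for node in orbit}
    links = {frozenset((rep[u], rep[v])) for u, nbrs in neighbors.items() for v in nbrs}
    return set(rep.values()), links

def duplicate_counts(neighbors):
    """Differences in node and link numbers between original and reduced network."""
    n_nodes = sum(1 for nbrs in neighbors.values() if nbrs)
    n_links = sum(len(nbrs) for nbrs in neighbors.values()) // 2
    red_nodes, red_links = reduce_network(neighbors)
    return n_nodes - len(red_nodes), n_links - len(red_links)

# Hypothetical examples: a star with three leaves and a box (bi-fan).
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
box = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1}}
```

For the star, the three leaves form one orbit, so two nodes and two links disappear in the reduced network. For the box, both pairs of opposite nodes are orbits and the reduced network is a single link.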





                       Duplicate nodes   Duplicate links
PIN D. melanogaster    686               728
Randomized PIN         626.0 ± 22.0      629.1 ± 22.0
Random network         151.0 ± 14.2      151.0 ± 14.2

TABLE I: The table shows the number of nodes that are duplicates
(duplicate nodes) and the number of neighbors associated
with these nodes (duplicate links) in the original network
and the two reference networks, that is, the difference in the
number of nodes and links between the original network and
the reduced network. Isolated nodes have been neglected.

Tab. I shows that the PIN has more duplicate nodes
(with associated links) than the reference networks.
Fig. 7 shows the frequency of orbits of a given size in the
original as well as in the reference networks. The distribution
of orbit sizes in the original network is very close to the
one found in the randomized PIN with the same degree
sequence, but much broader than that of the classical
random network.

Again, spectral analysis offers a complementary approach
to the topic. Using the results in Appendix A, we determined
the eigenvalues of those graphs that arise from
duplication of the two simplest reduced graphs: a line
FIG. 7: The frequency of orbit size in the PIN of Drosophila
melanogaster (solid line), a randomized PIN with identical
degrees at each node (dashed line) and a classical random
network of identical size and average degree $\langle k \rangle = 2.05$
(dotted line). Isolated nodes have been neglected.



and a triangle (graphs 2.1 and 3.2 in Fig. 4). We allowed
for up to ten duplications of each node of the reduced
network and searched for the eigenvalues of the resulting
subgraphs. However, spectral analysis is only consistent
with the emergence of star graphs and the original triangle
as well as the box or bi-fan structure (graph 4.3 in
Fig. 4).

Considering the representation of star graphs in the spectrum
of the PIN, one might guess that the high frequency
of large orbits mainly reflects nodes with many leaves. A
look at the joint distribution of the size of an orbit and
the degree of its nodes in the original and the reference
networks supports this hypothesis. In 100 reference networks
(of both kinds) the nodes in an orbit larger than
one have degree one, that is, only nodes with degree one
have duplicates. Only in extremely rare cases do nodes
with degree two have a single duplicate.
This matches the global situation in the original network;
however, there are some remarkable exceptions
with nodes of high degree in large orbits, shown in Fig. 8,
that cannot be found in the reference networks. This
is also well in accordance with the values found in Tab. I.
Different from the original network, in both reference
networks the number of duplicate links is practically the
same as the number of duplicate nodes.
From the Database of Interacting Proteins [20] and FlyBase
[28] we derived names and descriptions (if available)
for the proteins in Fig. 8, as shown in Appendix B.
We find that duplicate proteins are likely to have similar
functionality, in accordance with results in the yeast PIN.



FIG. 8: Subgraphs of nodes that form orbits of size ≥ 2 and
that have degree ≥ 2 (black); orbits of size 2 of nodes with
degree 2 have been omitted. The white nodes are the neighbors
of duplicate nodes that may have more neighbors than shown.
The nodes' labels are their identifiers from the Database of
Interacting Proteins [20], cf. Appendix B for more details.



IV. SUMMARY AND CONCLUSIONS 

Recent developments in the research on complex networks
have brought about a better understanding of a network's
topology and its connection to functionality. However,
a comprehensive theory of networks incorporating
classical graph theory as well as recent findings into a
self-consistent framework has still to be worked out. Considering
spectral graph theory to be a promising ansatz for
this attempt, we have carried out a case study on the PIN of
Drosophila melanogaster. The eigenvalues of a network's
adjacency matrix (and of related matrices) provide information
about a network's structural properties like the
number of connected components, its diameter or characteristics
of its degree distribution. Here, we have put
special emphasis on the investigation of the discrete spectrum
of a sparse network, relating it to prevalent substructures.
Although it will probably not be possible to derive
the densities of specific subgraphs from the spectrum of
a network, we could show that structural prevalences on
a more abstract level are reflected in the (discrete) spectrum
of the PIN under investigation. While we here focused
on the appearance of loops in subgraphs as well as
in the whole PIN, future analysis might reveal further
topological features.

Considering the evolutionary history of PINs, we also discussed
the appearance of proteins that share their neighbors,
together with the fingerprints of these structures
that can be found in the network's spectrum. Studying
structures of duplicate proteins in more detail, we find
that they often have close functional relationships, in
accordance with earlier findings in yeast.
The requirement applied here for the members of an orbit
to show exactly the same neighborhood is very restrictive,
though required to allow for transitivity. This
might be generalized by the definition of a similarity measure
that quantifies the overlap of the neighborhoods of
two nodes. This similarity measure can be used to define a
distance measure between nodes, and the application of a
clustering algorithm in the associated metric space might
give further insight into local structures.
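One conceivable realization of such a similarity measure is the Jaccard index of the two neighborhoods, sketched below. This particular choice is an assumption made for illustration; the text does not commit to a specific measure. The function names are hypothetical.

```python
def neighborhood_similarity(nbrs_u, nbrs_v):
    """Jaccard index: shared neighbors divided by all neighbors of either node."""
    union = nbrs_u | nbrs_v
    if not union:  # two isolated nodes are treated as identical
        return 1.0
    return len(nbrs_u & nbrs_v) / len(union)

def neighborhood_distance(nbrs_u, nbrs_v):
    """Jaccard distance, which vanishes exactly for nodes of the same orbit."""
    return 1.0 - neighborhood_similarity(nbrs_u, nbrs_v)

# Hypothetical neighborhoods of two nodes sharing two of four partners.
partial_overlap = neighborhood_similarity({1, 2, 3}, {2, 3, 4})
same_orbit = neighborhood_distance({5, 6}, {5, 6})
```

With this choice, the strict orbit definition used in the study is recovered as the set of node pairs at distance zero, while a clustering algorithm with a finite distance threshold would also group imperfect duplicates.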
This case study shows that a more systematic assessment
of the relation between a network's spectral and topological
properties has to be a topic of future research. It is a
challenging task; however, it can bring important insight
into a network's structure in a less biased and more systematic
way than currently available.
Appendix A

Let $A$ be an $N \times N$ matrix representing an undirected
graph, i.e. a symmetric matrix with entries $a_{ij} \in \{0, 1\}$
and $a_{ii} = 0$. Let $D$ be the matrix that is obtained after $m$
perfect duplications of nodes or, in other words, by $m$
duplications of rows and columns, respectively. Let
$i_1, \ldots, i_l \in \{1, \ldots, N\}$ be the numbers (identifiers) of the
$l$ (mutually different) nodes that have been duplicated and
$m_{i_1}, \ldots, m_{i_l}$ be the corresponding numbers of duplications
per node with $m = \sum_{j=1}^{l} m_{i_j}$. Let $A_{i_r}$ be the matrix $A$
but with the element $a_{i_r i_r}$ replaced by $a_{i_r i_r} + \lambda$.
Analogously, the matrix $A_{i_1 \ldots i_l}$ corresponds to the matrix $A$
but with $a_{i_1 i_1}, \ldots, a_{i_l i_l}$ replaced by
$a_{i_1 i_1} + \lambda, \ldots, a_{i_l i_l} + \lambda$. Let furthermore $I$ be
the identity matrix. Then the following equation holds



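$$
\begin{aligned}
\det(D - \lambda I) ={}& (-\lambda)^m \det(A - \lambda I) \\
&+ (-\lambda)^m \sum_{r \le l} m_{i_r} \det(A_{i_r} - \lambda I) \\
&+ (-\lambda)^m \sum_{r \le l-1} \sum_{r < j \le l} m_{i_r} m_{i_j} \det(A_{i_r i_j} - \lambda I) \\
&+ (-\lambda)^m \sum_{r \le l-2} \sum_{r < j \le l-1} \sum_{j < s \le l} m_{i_r} m_{i_j} m_{i_s} \det(A_{i_r i_j i_s} - \lambda I) \\
&+ \ldots \\
&+ (-\lambda)^m \sum_{r \le 1} \sum_{r < j \le 2} \cdots \sum_{x < y \le l} m_{i_r} m_{i_j} \cdots m_{i_y} \det(A_{i_r i_j \ldots i_y} - \lambda I). \qquad (4)
\end{aligned}
$$

Note that the multiple sum in the last term reduces to
$m_{i_1} \cdots m_{i_l} \det(A_{i_1 \ldots i_l} - \lambda I)$. It becomes obvious from
this formula that perfect duplication of nodes only adds
zeros to the spectrum of the graph, one for each duplication
through the common factor $(-\lambda)^m$, while the remaining
eigenvalues follow from the sum of determinants.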

Equation (4) can be proven by induction. Considering a
graph with adjacency matrix $A$ in which an arbitrary
node $i$ is duplicated $m$ times, leading to a duplication
matrix $D$, one can show that

$$\det(D - \lambda I) = (-\lambda)^m \left[ \det(A - \lambda I) + m \det(A_i - \lambda I) \right]. \qquad (5)$$

After validating the case $l = 0$, $m = 0$ of equation (4),
we do the induction step by evaluating the adjacency matrix
$\tilde{D}$ of a graph generated from the duplication graph
represented by $D$ by duplicating a (non-duplicate) node
$i_{l+1}$ $m_{i_{l+1}}$ times. Therefore, we apply (5),

$$\det(\tilde{D} - \lambda I) = (-\lambda)^{m_{i_{l+1}}} \left[ \det(D - \lambda I) + m_{i_{l+1}} \det(D_{i_{l+1}} - \lambda I) \right],$$

and derive $\det(D - \lambda I)$ and $\det(D_{i_{l+1}} - \lambda I)$ using the
assumption (4), yielding the formula for $m + m_{i_{l+1}}$
duplications of $l + 1$ mutually different nodes.
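The single-node relation (5), on which the induction rests, can be checked numerically for an arbitrary graph. The following sketch assumes NumPy; the function names are hypothetical, and "perfect duplication" is read as adding copies that share all neighbors of the duplicated node without being linked to it or to each other.

```python
import numpy as np

def duplicate_node(a, i, m):
    """Adjacency matrix after m perfect duplications of node i."""
    n = a.shape[0]
    d = np.zeros((n + m, n + m))
    d[:n, :n] = a
    for c in range(n, n + m):
        d[c, :n] = a[i, :]   # each copy gets the neighbors of node i
        d[:n, c] = a[:, i]
    return d

def char_poly_duplicated(a, i, m, lam):
    """Left-hand side of Eq. (5): det(D - lambda I)."""
    d = duplicate_node(a, i, m)
    return np.linalg.det(d - lam * np.eye(d.shape[0]))

def eq5_rhs(a, i, m, lam):
    """Right-hand side of Eq. (5)."""
    ident = np.eye(a.shape[0])
    a_i = a.copy()
    a_i[i, i] += lam         # A_i: element a_ii replaced by a_ii + lambda
    return (-lam) ** m * (np.linalg.det(a - lam * ident)
                          + m * np.linalg.det(a_i - lam * ident))

# Two connected nodes, node 0 duplicated twice, evaluated at an arbitrary lambda.
link = np.array([[0.0, 1.0], [1.0, 0.0]])
lhs = char_poly_duplicated(link, 0, 2, 0.5)
rhs = eq5_rhs(link, 0, 2, 0.5)
```

Agreement of `lhs` and `rhs` for several values of $\lambda$ and several graphs is of course no proof, but it is a cheap safeguard against sign or index errors in Eq. (5).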

As an example, this formula is applied to the $2 \times 2$ matrix
$A$ corresponding to two connected nodes. Then, $i_1 = 1$,
$i_2 = 2$ and one gets for the matrix $D$ after $m = m_1 + m_2$
duplications:

$$\det(D - \lambda I) = (-\lambda)^m \left( \lambda^2 - 1 - m_1 - m_2 - m_1 m_2 \right), \qquad \lambda_{1,2} = \pm\sqrt{1 + m + m_1 m_2}.$$

Translating this into the number of nodes per orbit, $n_i = m_i + 1$,
leads to the eigenvalues

$$\lambda_{1,2} = \pm\sqrt{n_1 n_2}.$$
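The closed-form result of this example is easy to confirm numerically, since two linked nodes duplicated into orbits of sizes $n_1$ and $n_2$ form a complete bipartite graph. The sketch assumes NumPy; the function name and the orbit sizes 3 and 4 are chosen for illustration only.

```python
import numpy as np

def duplicated_link_spectrum(n1, n2):
    """Spectrum of two linked nodes after duplication into orbits of sizes n1 and n2."""
    a = np.zeros((n1 + n2, n1 + n2))
    a[:n1, n1:] = 1.0   # every node of the first orbit is linked to
    a[n1:, :n1] = 1.0   # every node of the second orbit
    return np.linalg.eigvalsh(a)

eigs = duplicated_link_spectrum(3, 4)
extreme = (eigs[0], eigs[-1])   # expected: -sqrt(12) and +sqrt(12)
zeros = eigs[1:-1]              # expected: n1 + n2 - 2 vanishing eigenvalues
```

The only non-zero eigenvalues are $\pm\sqrt{n_1 n_2}$, and each of the $m = n_1 + n_2 - 2$ duplications contributes one zero, in line with the factor $(-\lambda)^m$.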



Appendix B 



The following tables contain the information on the proteins
shown in Fig. 8, extracted from the Database of
Interacting Proteins [20] and FlyBase [28].



DIP ID       Protein name/description
-----------  ------------------------------------------------
DIP:17489N   CG11719-PA open reading frame, Mst98Ca,
             (Male-specific RNA 98Ca)
DIP:17490N   CG18396-PA open reading frame, Mst98Cb,
             (Male-specific RNA 98Cb)
DIP:17144N   CG4015-PA open reading frame, Fcp3C,
             (Follicle cell protein 3C)
DIP:18312N   CG17777-PA open reading frame
DIP:17051N   CG17666-PA open reading frame
DIP:17123N   CG15781-PA open reading frame
DIP:17121N   CG15032-PA open reading frame
DIP:17122N   CG15489-PA open reading frame
DIP:20125N   CG1981-PA open reading frame, Thd1,
             G/T-mismatch-specific-thymine-DNA-glycosylase,
             double-stranded DNA-binding, mismatch repair
DIP:20084N   CG13363-PA open reading frame
DIP:19658N   CG12212-PA open reading frame, peb, (pebbled),
             transcription factor activity
DIP:17485N   CG10154-PA open reading frame,
             structural constituent of peritrophic
             membrane, (sensu Insecta)

TABLE II: Proteins found in the 2-orbit with nodes of degree
10; both duplicates (bold) are male-specific RNA (with
corresponding polypeptides).



DIP ID       Protein name/description
-----------  ------------------------------------------------
DIP:18704N   CG2789-PA open reading frame,
             benzodiazepine receptor activity,
             transporter activity, metabolism and transport
DIP:18703N   CG1341-PA open reading frame, Rpt1,
             endopeptidase activity, ATPase activity,
             proteolysis and peptidolysis
DIP:21261N   CG3173-PA open reading frame
DIP:20398N   CG12096-PA open reading frame
DIP:17864N   CG10694-PA open reading frame,
             damaged DNA-binding, base excision repair
-----------  ------------------------------------------------
DIP:20457N   CG12405-PA open reading frame, Prx2540-1,
             (Peroxiredoxin 2540),
             peroxidase, antioxidant activity,
             defense response, oxygen species metabolism
DIP:20458N   CG12896-PA open reading frame,
             peroxidase activity,
             defense response, oxygen species metabolism
DIP:20623N   CG9624-PA open reading frame
DIP:17811N   CG5576-PA open reading frame, imd,
             (immune deficiency),
             antimicrobial humoral response,
             (sensu Invertebrata)
DIP:17346N   CG12470-PA open reading frame
-----------  ------------------------------------------------
DIP:18389N   CG18779-PA open reading frame
DIP:18387N   open reading frame CG-10530/4-PA,
             Lcp65Ag1/Lcp65Ag2 protein,
             (larval cuticle protein),
             structural constituent of larval cuticle,
             (sensu Insecta),
             larval cuticle biosynthesis,
             (sensu Insecta)
DIP:18392N   CG2082-PA open reading frame,
             signal transduction
DIP:18391N   CG16978-PA open reading frame
DIP:18390N   CG12907-PA open reading frame

TABLE III: Proteins found in a 2-orbit with nodes of degree
3, lines separate different orbits, bold proteins are duplicates.



DIP ID       Protein name/description
-----------  ------------------------------------------------
DIP:20847N   CG8284-PA open reading frame, UbcD4,
             (Ubiquitin conjugating enzyme 4),
             ubiquitin conjugating enzyme activity,
             ligase activity,
             protein metabolism, ubiquitin cycle
DIP:23198N   CG30344-PA open reading frame
DIP:18933N   CG10862-PA open reading frame,
             ubiquitin conjugating enzyme activity,
             ligase activity,
             protein metabolism
DIP:18935N   CG8974-PA open reading frame,
             transcription regulatory activity,
             "nucleo-metabolism", transcription
DIP:18934N   CG32581-PA open reading frame,
             transcription regulatory activity,
             "nucleo-metabolism", transcription
-----------  ------------------------------------------------
DIP:20125N   CG1981-PA open reading frame, Thd1,
             G/T-mismatch-specific-thymine-DNA-glycosylase,
             double-stranded DNA-binding, mismatch repair
DIP:19658N   CG12212-PA open reading frame, peb, (pebbled),
             transcription factor activity
DIP:20084N   CG13363-PA open reading frame
DIP:17489N   CG11719-PA open reading frame, Mst98Ca,
             (Male-specific RNA 98Ca)
DIP:17490N   CG18396-PA open reading frame, Mst98Cb,
             (Male-specific RNA 98Cb)

TABLE IV: Proteins found in a 3-orbit with nodes of degree
2, lines separate different orbits, bold proteins are duplicates.
Note that Mst98Ca and Mst98Cb form a 2-orbit of degree
10, too (cf. Tab. II).



DIP ID       Protein name/description
-----------  ------------------------------------------------
DIP:19548N   CG31366/18743-PA open reading frame,
             Hsp70A, (Heat shock protein 70A),
             heat, defense response,
             protein complex assembly and folding
DIP:19549N   CG31449/31359/6489-PA open reading frame,
             Hsp70B, (heat shock protein 70B),
             heat, defense response,
             protein complex assembly and folding
DIP:19551N   open reading frame CG31449-PA, Hsp70Ba,
             (heat shock protein 70Ba),
             heat, defense response,
             protein complex assembly and folding
DIP:19552N   CG5834-PA open reading frame, Hsp70Bbb,
             (heat shock protein 70Bbb)
DIP:18493N   CG7945-PA open reading frame,
             chaperone activity
DIP:20272N   CG5203-PA open reading frame, CHIP,
             chaperone activity, protein folding
             and metabolism
DIP:20308N   CG32130-PA open reading frame
DIP:18578N   CG13165-PA open reading frame

TABLE V: Proteins found in the 4-orbit with nodes of degree
4; all duplicates (bold) are heat shock proteins (Hsp),
released after heat shock or other stress.



DIP ID       Protein name/description
-----------  ------------------------------------------------
DIP:19748N   CG1252-PA open reading frame, Ccp84Ab,
             (cuticle cluster 7),
             structural constituent of larval cuticle,
             (sensu Insecta)
DIP:19749N   CG2360-PA open reading frame, Ccp84Aa,
             (cuticle cluster 8),
             structural constituent of larval cuticle,
             (sensu Insecta)
DIP:17392N   CG9949-PA open reading frame, sina,
             (seven in absentia),
             sensory organ development
DIP:18536N   CG6615-PA open reading frame, scaf6,
             RNA binding, nuclear mRNA splicing
             via spliceosome, spliceosome complex
DIP:20225N   CG2341-PA open reading frame, Ccp84Ad,
             (cuticle cluster 5),
             structural constituent of larval cuticle,
             (sensu Insecta)
DIP:17713N   CG15422-PA open reading frame
DIP:17492N   CG12723-PA open reading frame
DIP:17076N   CG6945-PA open reading frame
DIP:17488N   CG11505-PB open reading frame

TABLE VI: Proteins found in the 2-orbit with nodes of degree
7; duplicates (bold) are constituents of the larval cuticle.
Note that peb, Thd1, and CG13363-PA also form a 3-orbit
with respect to Mst98Ca and Mst98Cb (cf. Tab. IV).